Context-aware chatbot system and method

ABSTRACT

A context-aware chatbot method and system are provided. The context-aware chatbot method comprises receiving a user's voice; converting the user's voice to a question to be answered; determining a question type of the question to be answered; generating at least one answer to the question based on a context-aware neural conversation model; validating the answer generated by the context-aware neural conversation model; and delivering the answer validated to the user. The context-aware neural conversation model takes contextual information of the question into consideration, and decomposes the contextual information of the question into a plurality of high dimension vectors.

FIELD OF THE INVENTION

The present invention relates generally to the field of computer technologies and, more particularly, to a context-aware chatbot system and method.

BACKGROUND

As E-commerce is emerging, successful information access on E-commerce websites, which accommodate both customer needs and business requirements, becomes essential and critical. Menu-driven navigation and keyword search provided by most commercial sites have tremendous limitations, as they tend to overwhelm and frustrate users with lengthy and rigid interactions. User interest in a particular site often decreases exponentially with the increase in the number of mouse clicks. Thus, shortening the interaction path to provide useful information becomes important.

Many E-commerce sites attempt to solve the problem by providing keyword search capabilities. However, keyword search engines usually require users to know domain-specific jargon. Unfortunately, keyword search does not allow users to precisely describe their intention and, more importantly, keyword search lacks an understanding of the semantic meanings of the search words and phrases. For example, keyword search engines usually may not understand that “summer dress” should be looked up in women's clothing under “dress”, whereas “dress shirt” should most likely be looked up in men's clothing under “shirt”. A search for “shirt” often reveals dozens or even hundreds of items, which are useless for somebody who has a specific style and pattern in mind.

Given the abovementioned limitations, a current solution is natural language (and multimodal) dialog, namely the chatbot. Chatbots have been used in a large variety of fields, such as call-center/routing applications, e-mail routing, information retrieval and database access, and telephony banking. Recently, chatbots have become even more popular with access to large amounts of user data.

However, according to the present disclosure, existing chatbot technologies are often restricted to specific domains or applications (e.g., booking an airline ticket) and require handcrafted rules. Furthermore, in a real dialogue between a user and a robot, the user's context could be substantially complex and continuously changing. Thus, context-aware and proactive technologies are highly desired to be incorporated into a chatbot system.

The disclosed methods and systems are directed to solve one or more problems set forth above and other problems.

BRIEF SUMMARY OF THE DISCLOSURE

One aspect of the present disclosure includes a context-aware chatbot method. The context-aware chatbot method comprises receiving a user's voice; converting the user's voice to a question to be answered; determining a question type of the question to be answered; generating at least one answer to the question based on a context-aware neural conversation model; validating the answer generated by the context-aware neural conversation model; and delivering the answer validated to the user. The context-aware neural conversation model takes contextual information of the question into consideration, and decomposes the contextual information of the question into a plurality of high dimension vectors.

Another aspect of the present disclosure includes a non-transitory computer-readable medium having a computer program for, when being executed by a processor, performing a context-aware chatbot method based on a multimodal deep neural network. The context-aware chatbot method comprises receiving a user's voice; converting the user's voice to a question to be answered; determining a question type of the question to be answered; generating at least one answer to the question based on a context-aware neural conversation model; validating the answer generated by the context-aware neural conversation model; and delivering the answer validated to the user. The context-aware neural conversation model takes contextual information of the question into consideration, and decomposes the contextual information of the question into a plurality of high dimension vectors.

Another aspect of the present disclosure includes a context-aware chatbot system. The context-aware chatbot system comprises a question acquisition module configured to receive a user's voice and convert the user's voice to a question to be answered; a question determination module configured to determine a question type of the question to be answered; a context-aware neural conversation module configured to generate at least one answer to the question by taking contextual information of the question into consideration and decomposing the contextual information of the question into a plurality of high dimension vectors; an evidence validation module configured to validate the answer generated by the context-aware neural conversation module; and an answer delivery module configured to deliver the answer validated to the user.

Other aspects of the present disclosure can be understood by those skilled in the art in light of the description, the claims, and the drawings of the present disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings are merely examples for illustrative purposes according to various disclosed embodiments and are not intended to limit the scope of the present disclosure.

FIG. 1 illustrates an exemplary environment incorporating certain embodiments of the present invention;

FIG. 2 illustrates an exemplary computing system consistent with disclosed embodiments;

FIG. 3 illustrates an exemplary context-aware chatbot system consistent with disclosed embodiments;

FIG. 4 illustrates a flow chart of an exemplary context-aware chatbot method consistent with disclosed embodiments; and

FIG. 5 illustrates an exemplary context-aware neural conversational model consistent with disclosed embodiments.

DETAILED DESCRIPTION

Reference will now be made in detail to exemplary embodiments of the invention, which are illustrated in the accompanying drawings. Hereinafter, embodiments consistent with the disclosure will be described with reference to drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. It is apparent that the described embodiments are some but not all of the embodiments of the present invention. Based on the disclosed embodiments, persons of ordinary skill in the art may derive other embodiments consistent with the present disclosure, all of which are within the scope of the present invention.

Chatbot systems are paramount for a wide range of tasks in an enterprise. A chatbot system has to communicate clearly with its suppliers and partners, and engage clients in an ongoing dialog, not merely metaphorically but also literally, which is essential for maintaining an ongoing relationship. Communication characterized by information-seeking and task-oriented dialogs is central to five major families of business applications: customer service, help desk, website navigation, guided selling, and technical support.

Customer service responds to customers' general questions about products and services, e.g., answering questions about applying for an automobile loan or home mortgage. Help desk responds to internal employee questions, e.g., responding to HR questions. Website navigation guides customers to relevant portions of complex websites. A “Website concierge” is invaluable in helping people determine where information or services reside on a company's website. Guided selling provides answers and guidance in the sales process, particularly for complex products being sold to novice customers. Technical support responds to technical problems, such as diagnosing a problem with a device.

In commerce, clear communication is critical for acquiring, serving, and retaining customers. Companies often educate their potential customers about their products and services and, meanwhile, increase customer satisfaction and customer retention by developing a clear understanding of their customers' needs. However, customers are often frustrated by fruitless searches through websites, long waits in call queues to speak with customer service representatives, and delays of several days for email responses. Thus, correct and prompt answers to customers' inquiries are highly desired.

The existing chatbot systems focus on training the question-answer pairs and recommending the most likely response to individual users, while not taking any contextual information into consideration. Contextual information refers to information relevant to an understanding of the text, for example, the identity of things named in the text (people, places, books, etc.); information about things named in the text (birth dates, geographical locations, date published, etc.); and interpretive information (themes, keywords, and normalization of measurements, dates, etc.).

That is, traditional chatbot systems only deal with users and conversations, but do not embed the conversation into a context when responding to the users. Considering only users and conversations may be insufficient for many applications. For example, using the temporal context, a travel conversational system would provide a vacation recommendation in the winter which may be very different from the one in the summer. Similarly, in a consumer conversational system, it is important to determine what content is to be delivered to a customer and when. Thus, incorporating the contextual information in the conversational system to respond to users in certain circumstances is highly desired.

Mapping sequences to sequences based on neural networks has been used for neural machine translation, improving English-French and English-German translation tasks. Because vanilla recurrent neural networks (RNNs) suffer from vanishing gradients, variants of the Long Short Term Memory (LSTM) recurrent neural network may be adopted. Besides, bots and conversational agents have been proposed. However, most of these systems require a rather complicated processing pipeline of many stages, and the corresponding methods do not consider the changes in the user's context.

The present disclosure provides a context-aware chatbot method based on a neural conversational model, which may take contextual features into consideration. The neural conversational model may be trained end-to-end and, thus, may require significantly fewer handcrafted rules. The disclosed context-aware chatbot method may incorporate contextual information in a neural conversational model, which may enable a chatbot to be aware of context in a communication with the user. A contextual real-valued input vector may be provided in association with each word to simplify the training process. The vector learned from the context may be used to convey the contextual information of the sentences being modeled.

FIG. 1 illustrates an exemplary environment 100 incorporating certain embodiments of the present invention. As shown in FIG. 1, the environment 100 may include a user terminal 102, a server 104, a user 106, and a network 110. Other devices may also be included.

The user terminal 102 may include any appropriate type of electronic device with computing capabilities, such as a wearable device (e.g., a smart watch, a wristband), a mobile phone, a smartphone, a tablet, a personal computer (PC), a server computer, a laptop computer, and a personal digital assistant (PDA), etc.

The server 104 may include any appropriate type of server computer or a plurality of server computers for providing personalized contents to the user 106. For example, the server 104 may be a cloud computing server. The server 104 may also facilitate the communication, data storage, and data processing between the other servers and the user terminal 102. The user terminal 102 and the server 104 may communicate with each other through one or more communication networks 110, such as a cable network, a phone network, and/or a satellite network, etc.

The user 106 may interact with the user terminal 102 to query and to retrieve various contents and perform other activities of interest, or the user may use voice, hand or body gestures to control the user terminal 102 if a speech recognition engine, a motion sensor, or a depth camera is used by the user terminal 102. The user 106 may be a single user or a plurality of users, such as family members.

The user terminal 102 and/or the server 104 may be implemented on any appropriate computing circuitry platform. FIG. 2 shows a block diagram of an exemplary computing system capable of implementing the user terminal 102 and/or the server 104.

As shown in FIG. 2, the computing system 200 may include a processor 202, a storage medium 204, a display 206, a communication module 208, a database 214, and peripherals 212. Certain components may be omitted and other components may be included.

The processor 202 may include any appropriate processor or processors. Further, the processor 202 can include multiple cores for multi-thread or parallel processing. The storage medium 204 may include memory modules, such as ROM, RAM, and flash memory modules, and mass storages, such as CD-ROM and hard disk, etc. The storage medium 204 may store computer programs for implementing various processes, when the computer programs are executed by the processor 202.

Further, the peripherals 212 may include various sensors and other I/O devices, such as keyboard and mouse, and the communication module 208 may include certain network interface devices for establishing connections through communication networks. The database 214 may include one or more databases for storing certain data and for performing certain operations on the stored data, such as database searching.

Returning to FIG. 1, the user terminal 102 and the server 104 may be implemented with a context-aware chatbot system. FIG. 3 illustrates an exemplary context-aware chatbot system. As shown in FIG. 3, the context-aware chatbot system 300 may include a question acquisition module 301, a question determination module 302, a context-aware neural conversation module 303, an evidence validation module 304, and an answer delivery module 305.

The question acquisition module 301 may be configured to receive a user's question. The user's question may be received in various ways, for example, as text, voice, or sign language. In one embodiment, the question acquisition module 301 may be configured to receive a user's voice and convert the user's voice to a corresponding question, for example, with the help of speech recognition engines.

The question determination module 302 may be configured to analyze the question and determine a question type. Analyzing the question may refer to deriving the semantic meaning of that question (what the question is actually asking). The question determination module 302 may be configured to analyze the question through deriving how many parts or meanings are embedded in the question. Features of questions may be learned for a question-answer matching.

In particular, the question determination module 302 may be configured to identify a Lexical Answer Type (LAT). A lexical answer type is a word or noun phrase in the question that specifies the type of the answer without any attempt to understand its semantics. Determining whether or not a candidate answer can be considered an instance of the LAT is an important kind of scoring and a common source of critical errors. For example, given a question “recommend me some restaurant?”, the question determination module 302 may be configured to analyze the syntax of the sentence and infer that the question is asking for a place.

The context-aware neural conversation module 303 may be configured to generate answers to the question and a sequence of answers to the question based on a context-aware neural conversation model, i.e., use the data from the question analysis to generate candidate answers. In particular, when a question is received, the context-aware neural conversation module 303 may be configured to recognize the contextual information of the question even if the context does not appear in the question. For example, the context-aware neural conversation module 303 may be configured to add time, event, etc., as input into the context-aware neural conversational model.

Moreover, the context-aware neural conversation module 303 may be configured to infer answers to questions even if the evidence is not readily present in the training set, which may be important because the training data may not contain explicit information about every attribute of each user. The context-aware neural conversation module 303 may be configured to learn event representations based on conversational content produced by different events, in which events producing similar responses may tend to have similar embeddings. Thus, the training data nearby in the vector space may increase the generalization capability of the context-aware neural conversation model.

The evidence validation module 304 may be configured to validate the answer generated by the context-aware neural conversation module 303. Although the answers are generated, the user may not accept the answer. Thus, the evidence validation module 304 may be configured to calculate a confidence score for quality control. In one embodiment, the confidence score may be calculated as the Kullback-Leibler distance between the question and the answer, and then normalized between 0 and 1.

For example, a predetermined confidence score may be provided as a standard. If the calculated confidence score is larger than the predetermined confidence score, the corresponding answer may be considered as valid. The answer delivery module 305 may be configured to deliver the validated answer to the user. If the calculated confidence score is smaller than the predetermined confidence score, the corresponding answer may be considered as invalid. Then the context-aware neural conversation module 303 may generate a new answer until the answer is validated. In addition, the validated answers may also be used for training for future questions.

The present disclosure also provides a context-aware chatbot method. To take the contextual information into consideration, the context-aware chatbot method may model the response with context. Each event may be represented as a vector for embedding, such that event information (e.g., weather, traffic) that influences the content and style of responses may be encoded. FIG. 4 illustrates a flow chart of an exemplary context-aware chatbot method consistent with disclosed embodiments.

As shown in FIG. 4, at the beginning, a user's voice is received (S402). The user's voice may be in real time or may be recorded, and the user's voice may be received by a microphone and then converted into a digital format or into a data file. The user's voice may also be received as data in a digital format or in the form of a data file. Any appropriate method may be used to receive the user data.

Further, the user's voice is converted to a question to be answered (S404). That is, a question is issued by the user in his/her voice. In one embodiment, the user's voice may be recognized into text and the question may be obtained by analyzing the text. Alternatively, the data of the user's voice may be analyzed to obtain the question or questions. In another embodiment, the question to be answered may be received in other ways, for example, as text or sign language, and is not limited to voice.

Then, the question to be answered is analyzed to determine a question type (S406). For example, the question to be answered may be regarding time, location or place, etc. The question to be answered may be analyzed through deriving how many parts or meanings are embedded in the question to be answered. In one embodiment, the question type may be determined through identifying a Lexical Answer Type (LAT). For example, given a question “recommend me some restaurant?”, the syntax of the sentence may be analyzed, and the question to be answered may be inferred as a question regarding a place.
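As a minimal sketch of step S406, the following Python fragment assigns a question type by looking for cue words. The cue-word table `LAT_CUES` and the function name `question_type` are hypothetical and chosen only for illustration; the disclosure does not specify how the LAT is identified, and a practical system would likely rely on syntactic analysis rather than a fixed word list.

```python
# Hypothetical cue-word table; the disclosure does not define one.
LAT_CUES = {
    "place": {"restaurant", "hotel", "city", "where", "place"},
    "time": {"when", "time", "date", "hour"},
    "person": {"who", "whom"},
}


def question_type(question: str) -> str:
    """Return the first answer type whose cue words appear in the question."""
    tokens = {tok.strip("?.,!").lower() for tok in question.split()}
    for answer_type, cues in LAT_CUES.items():
        if tokens & cues:
            return answer_type
    return "unknown"


print(question_type("recommend me some restaurant?"))  # prints "place"
```

Here the word “restaurant” acts as the lexical answer type, so the question is treated as a question regarding a place.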

After the question type is determined, at least one answer to the question is generated based on a context-aware neural conversation model (S408). That is, candidate answers may be generated based on the data from the step S406. A sequence of answers to the question to be answered may also be generated based on the context-aware neural conversation model, in which the answers may be ranked in a certain order, for example, an order of preference.

In particular, when a question is received by the context-aware neural conversation model, the context-aware neural conversation model may recognize the contextual information even if the context does not appear in the question. For example, time, event, etc., may be added as input into the context-aware neural conversational model.

FIG. 5 illustrates an exemplary context-aware neural conversational model consistent with disclosed embodiments. As shown in FIG. 5, each token in a sentence may be associated with an event-level representation v_(i)∈R^(K*1). In a standard SEQ2SEQ model, a sentence S may be encoded into a vector representation h_(S) using the source LSTM. Then, for each step on the target side, hidden units may be obtained by combining the representation produced by the target LSTM at the previous time step, the word representations at the current time step, and the context embedding v_(i).

The context-aware neural conversation model may add a hidden layer that encodes the event information v_(i), making the response context-aware. The embedding v_(i) may be shared across all conversations that involve event i. {v_(i)} may be learned by back-propagating word prediction errors to each neural component during training.
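The sharing of event embeddings can be sketched as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the event names, the dimension `K = 4`, the random initialization, and the choice to combine the three inputs by concatenation are all hypothetical. Under that concatenation assumption, the gate weights of the target LSTM would need 3K input columns instead of 2K.

```python
import random

K = 4  # hypothetical embedding and hidden size
random.seed(0)

# One vector v_i per event i, shared by every conversation tagged with that event.
# In training these entries would be updated by back-propagation; here they are
# only randomly initialized.
EVENTS = ["winter", "summer", "rain"]
event_embedding = {e: [random.uniform(-0.1, 0.1) for _ in range(K)] for e in EVENTS}


def decoder_input(h_prev, x_t, event):
    """Concatenate previous hidden state, current word vector, and event vector."""
    return h_prev + x_t + event_embedding[event]


z = decoder_input([0.0] * K, [1.0] * K, "winter")
print(len(z))  # 3 * K entries: [h_{t-1}; x_t; v_i]
```

Because every conversation tagged “winter” reads the same entry of `event_embedding`, prediction errors from all of those conversations would update one shared vector.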

Moreover, the context-aware neural conversation model may be able to infer answers to questions even if the evidence is not readily present in the training set, which may be important as the training data may not contain explicit information about every attribute of each user. The context-aware neural conversation model may learn event representations based on conversational content produced by different events, and events producing similar responses may tend to have similar embeddings. Thus, the training data nearby in the vector space may increase the generalization capability of the model.

For example, considering a question-answer pair “recommend some place for fun” and “I think lake tahoe is good” which is generated in the winter season, the context-aware neural conversation model may add time, location, people and other contextual information as inputs in the training process, which may be embedded into the learning of restaurant representations considering the contextual information. Then, “lake tahoe” may be a better answer for the winter season. In the test process, when a restaurant is asked about in a question, e.g., “how about the restaurant B.J. in lake tahoe”, the context-aware neural conversation model may detect that this question is asked in the summer season and may recommend a better result other than B.J. when noticing that “lake tahoe” is not close to the current context.

Then the step S408 may be converted to finding a response sentence or an answer Y={y₁, y₂, . . . , y_(n)} to a given input sentence X={x₁, x₂, . . . , x_(n)}, by taking the context EC={ec₁, ec₂, . . . , ec_(m)} into consideration, where x represents a word in the question, and y represents a word in the response. The problem of finding the response sentence Y may be converted to predicting y by maximizing the probability P (y_(t)|y_(t−1), . . . , y₁, ec). A neural network may be adopted to learn the representation of sentences without applying handcrafted rules.
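Written out over the whole response, and assuming that the per-word probabilities are multiplied according to the chain rule (an assumption; the text above states only the per-word term), the search for the response sentence may be expressed as:

$\hat{Y} = \arg\max_{Y} \prod_{t = 1}^{n} P\left( y_{t} \middle| y_{t - 1}, \ldots, y_{1}, ec \right)$

where $\hat{Y}$ is a symbol introduced here for the selected response, and the dependence on the input sentence X is taken to be carried by the encoder state rather than written explicitly.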

At each time step, a typical neural conversational model may provide each sentence with an input gate, a memory gate, and an output gate, which are respectively denoted as i_(t), f_(t), and o_(t). x_(t) denotes the vector for an individual text unit at time step t, h_(t) denotes the vector computed by the LSTM model at time step t by combining x_(t) and h_(t−1), c_(t) denotes the cell state vector at time step t, and θ denotes the sigmoid function. Then, the vector representation h_(t) for each time step t is given by:

$\begin{matrix} {\begin{bmatrix} i_{t} \\ f_{t} \\ o_{t} \\ l_{t} \end{bmatrix} = \begin{bmatrix} \theta \\ \theta \\ \theta \\ \tanh \end{bmatrix} W \cdot \begin{bmatrix} h_{t - 1} \\ x_{t}^{s} \end{bmatrix}} & (1) \\ {c_{t} = f_{t} \cdot c_{t - 1} + i_{t} \cdot l_{t}} & (2) \\ {h_{t}^{s} = o_{t} \cdot \tanh\left( c_{t} \right)} & (3) \end{matrix}$

where

$W = \begin{bmatrix} W_{i} \\ W_{f} \\ W_{o} \\ W_{l} \end{bmatrix},$

where W denotes learned and trained factors, and W_(i), W_(f), W_(o), W_(l)∈R^(K*2K).
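Equations (1) to (3) can be traced with a short pure-Python sketch of one LSTM step. The helper names `sigmoid`, `matvec`, and `lstm_step` are hypothetical, the products in equations (2) and (3) are assumed to be element-wise, and the all-zero weights in the usage lines are chosen only to make the arithmetic easy to follow.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(W_i, W_f, W_o, W_l, h_prev, x_t, c_prev):
    """One LSTM step following equations (1)-(3); each W_* has shape K x 2K."""
    z = h_prev + x_t  # stacked vector [h_{t-1}; x_t] of length 2K
    i_t = [sigmoid(a) for a in matvec(W_i, z)]    # input gate, eq. (1)
    f_t = [sigmoid(a) for a in matvec(W_f, z)]    # memory gate, eq. (1)
    o_t = [sigmoid(a) for a in matvec(W_o, z)]    # output gate, eq. (1)
    l_t = [math.tanh(a) for a in matvec(W_l, z)]  # candidate values, eq. (1)
    c_t = [f * c + i * l for f, c, i, l in zip(f_t, c_prev, i_t, l_t)]  # eq. (2)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                  # eq. (3)
    return h_t, c_t


K = 2
zeros = [[0.0] * (2 * K) for _ in range(K)]
h, c = lstm_step(zeros, zeros, zeros, zeros, [0.0] * K, [1.0] * K, [1.0, -2.0])
print(c)  # with all-zero weights every gate is 0.5, so the cell state is halved
```

With zero weights the candidate values l_(t) are zero and each gate equals 0.5, so c_(t) is simply half of c_(t−1).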

Different from the SEQ2SEQ generation task, each input X may be paired with a sequence of predicted outputs: Y={y₁, y₂, . . . , y_(n)}. The distribution over outputs and sequentially predicted tokens may be expressed by a softmax function:

$\begin{matrix}{{p\left( Y \middle| X \right)} = {{\prod\limits_{t = 1}^{n_{y}}\; {p\left( {\left. y_{t} \middle| x_{1} \right.,x_{2},\ldots \mspace{11mu},x_{t},y_{1},\ldots \mspace{11mu},y_{t - 1}} \right)}} = {\prod\limits_{t = 1}^{n_{y}}\frac{\exp \left( {f\left( {h_{t - 1},e_{yt}} \right)} \right)}{\sum\limits_{y^{\prime}}\; {\exp \left( {f\left( {h_{t - 1},e_{y^{\prime}}} \right)} \right)}}}}} & (4)\end{matrix}$

where f (h_(t−1), e_(yt)) denotes an activation function between h_(t−1) and e_(yt). Each sentence may be terminated with a special end-of-sentence symbol EOS. Thus, during decoding, the decoding algorithm may be terminated when an EOS token is predicted. At each time step, either a greedy approach or beam search may be adopted for word prediction.
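The softmax of equation (4) and the greedy variant of decoding can be sketched as follows. The spelling `"<EOS>"`, the `max_len` cap, and the scripted scorer `toy_scores` are hypothetical stand-ins; in the disclosed model the scores f (h_(t−1), e_(y)) would come from the trained decoder.

```python
import math

EOS = "<EOS>"  # hypothetical spelling of the end-of-sentence symbol


def softmax(scores):
    """Normalize raw scores f(h_{t-1}, e_y) into a distribution, as in eq. (4)."""
    m = max(scores.values())  # subtract the maximum for numerical stability
    exps = {y: math.exp(s - m) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def greedy_decode(score_fn, max_len=20):
    """Pick the most probable token at each step; stop when EOS is predicted."""
    output = []
    for _ in range(max_len):
        probs = softmax(score_fn(output))
        token = max(probs, key=probs.get)
        if token == EOS:
            break
        output.append(token)
    return output


def toy_scores(prefix):
    """Toy stand-in for the decoder: favors a fixed script of tokens."""
    script = ["lake", "tahoe", EOS]
    target = script[min(len(prefix), len(script) - 1)]
    return {tok: (2.0 if tok == target else 0.0) for tok in ["lake", "tahoe", "good", EOS]}


print(greedy_decode(toy_scores))  # ['lake', 'tahoe']
```

Beam search would keep several partial responses per step instead of the single best token; it is omitted here for brevity.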

After the answer to the question is generated, the answer is validated by an evidence validation model (S410). Although the answers are generated, the user may not accept the answers. Thus, a confidence score for quality control may be provided. In one embodiment, the confidence score may be calculated as a normalized Kullback-Leibler distance (between 0 and 1) between the question and the answer. The calculation of the Kullback-Leibler distance is well known by those skilled in the art and, thus, is not explained here.
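One possible way to obtain such a score is sketched below. The disclosure does not state how the question and the answer are turned into distributions or how the distance is normalized, so everything specific here is an assumption: smoothed unigram word distributions over the joint vocabulary, and the mapping exp(−KL), which yields a value in (0, 1] that is higher when the two distributions are closer.

```python
import math
from collections import Counter


def word_distribution(text, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution of text over vocab."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for distributions on the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def confidence(question, answer):
    """Map KL(question || answer) into (0, 1]; identical distributions give 1."""
    vocab = sorted(set(question.lower().split()) | set(answer.lower().split()))
    p = word_distribution(question, vocab)
    q = word_distribution(answer, vocab)
    return math.exp(-kl_divergence(p, q))


print(confidence("recommend some place for fun", "I think lake tahoe is good"))
```

The smoothing keeps every probability positive, so the logarithm is always defined even when the question and the answer share no words.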

For example, a predetermined confidence score may be provided as a standard, and whether the answer is valid or not is determined based on the calculated confidence score of the answer (S411). If the calculated confidence score is larger than the predetermined confidence score, the corresponding answer may be considered as valid and the valid answer is delivered to the user (S412). If the calculated confidence score is smaller than the predetermined confidence score, the corresponding answer may be considered as invalid, and steps S408, S410, and S411 may be repeated until the answer is determined as valid. In addition, the validated answers may also be used for training for future questions.
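The repetition of steps S408, S410, and S411 can be sketched as a simple loop. The function name, the `threshold` value of 0.5, and the `max_attempts` cap are hypothetical; the text above describes repeating until an answer is valid, and the cap is added here only so that the sketch always terminates.

```python
def answer_with_validation(question, generate, score, threshold=0.5, max_attempts=5):
    """Repeat generation (S408) and validation (S410, S411) until an answer passes."""
    for attempt in range(max_attempts):
        answer = generate(question, attempt)      # S408: produce a candidate
        if score(question, answer) > threshold:   # S410/S411: compare with the standard
            return answer                         # S412: deliver the valid answer
    return None  # no candidate passed; the source does not cover this case


# Toy stand-ins for the conversation model and the evidence validation model.
candidates = ["try again later", "lake tahoe is good"]


def toy_generate(question, attempt):
    return candidates[attempt % len(candidates)]


def toy_score(question, answer):
    return 0.9 if "tahoe" in answer else 0.1


print(answer_with_validation("recommend some place for fun", toy_generate, toy_score))
```

In this toy run the first candidate is rejected and the second is delivered.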

The disclosed method and context-aware chatbot system may respond to the user or answer questions by taking the contextual information into consideration. To realize a more accurate representation of question, answer and context, the contextual information may be input into the context-aware neural conversation model. That is, the contextual information may be input into the chat robot at a system level. The context-aware neural conversation model may learn the contextual information and question-answer pairs together. With the context-aware neural conversation model, the question-answer pairs may be trained without handcrafted rules, and the contextual information may be decomposed into a plurality of high dimension vectors, such as people, organization, object, agent, occurrence, purpose, time, place, form of expression, concept/abstraction, and relationship, etc.

By analyzing the context in the questions, the user's question may be paired with a better answer. That is, the chatbot may provide more relevant responses to the users, and the users may find services and products they need in different contexts, significantly improving the user experience. The disclosed method and context-aware chatbot system may be applied to various interesting applications without handcrafted rules.

In addition, the disclosed method and context-aware chatbot system may provide a general learning framework for methods and systems which have to take contextual information into consideration. The learned word-embedding representation of context may be used for other tasks in the future. The high dimension vectors representing the contextual information may also be used for personalization in recommender systems in the future.

Those of skill would further appreciate that the various illustrative modules and method steps disclosed in the embodiments may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative units and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The description of the disclosed embodiments is provided to illustrate the present invention to those skilled in the art. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

What is claimed is:
 1. A context-aware chatbot method, comprising: receiving a user's voice; converting the user's voice to a question to be answered; determining a question type of the question to be answered; generating at least one answer to the question based on a context-aware neural conversation model; validating the answer generated by the context-aware neural conversation model; and delivering the answer validated to the user, wherein the context-aware neural conversation model takes contextual information of the question into consideration, and decomposes the contextual information of the question into a plurality of high dimension vectors.
 2. The context-aware chatbot method according to claim 1, wherein determining a question type of the question to be answered further including: identifying a Lexical Answer Type (LAT) of the question to be answered.
 3. The context-aware chatbot method according to claim 1, wherein generating at least one answer to the question based on a context-aware neural conversation model further including: provided an input sentence X={x₁, x₂, . . . , x_(n)}, finding a response sentence Y={y₁, y₂, . . . , y_(n)} by taking a context EC={ec₁, ec₂, . . . , ec_(m)} into consideration, wherein x represents a word in the input sentence, y represents a word in the response sentence, the response sentence Y represents the answer, and the input sentence X represents the question to be answered.
 4. The context-aware chatbot method according to claim 3, wherein provided an input sentence X={x₁, x₂, . . . , x_(n)}, finding a response sentence Y={y₁, y₂, . . . , y_(n)} by taking a context EC={ec₁, ec₂, . . . , ec_(m)} into consideration further including: predicting y by maximizing a probability P (y_(t)|y_(t−1), . . . , y₁,ec).
 5. The context-aware chatbot method according to claim 4, wherein predicting y by maximizing a probability P (y_(t)|y_(t−1), . . . , y₁, ec) further including: providing the input sentence with an input gate i_(t), a memory gate f_(t), and an output gate o_(t), by the context-aware neural conversation model; and calculating a vector representation h_(t) for each time step t by: $\begin{aligned} \begin{bmatrix} i_t \\ f_t \\ o_t \\ l_t \end{bmatrix} &= \begin{bmatrix} \theta \\ \theta \\ \theta \\ \tanh \end{bmatrix} W \cdot \begin{bmatrix} h_{t-1} \\ x_t^s \end{bmatrix}, \\ c_t &= f_t \cdot c_{t-1} + i_t \cdot l_t, \\ h_t^s &= o_t \cdot \tanh(c_t), \end{aligned}$ where $W = \begin{bmatrix} W_i \\ W_f \\ W_o \\ W_l \end{bmatrix}$, where W_(i), W_(f), W_(o), W_(l)∈R^(K*2K), W denotes learned and trained factors, x_(t) denotes a vector representation for an individual word at time step t, h_(t) denotes a vector representation computed by Long Short Term Memory (LSTM) model at the time step t by combining x_(t) and h_(t−1), c_(t) denotes a cell state vector representation at time step t, and θ denotes a sigmoid function.
 6. The context-aware chatbot method according to claim 5, further including: calculating a distribution over outputs and sequentially predicted tokens based on a softmax function: $p(Y \mid X) = \prod_{t=1}^{n_y} p(y_t \mid x_1, x_2, \ldots, x_t, y_1, \ldots, y_{t-1}) = \prod_{t=1}^{n_y} \frac{\exp\left(f(h_{t-1}, e_{y_t})\right)}{\sum_{y'} \exp\left(f(h_{t-1}, e_{y'})\right)}$, where f (h_(t−1), e_(y_t)) denotes an activation function between h_(t−1) and e_(y_t).
 7. The context-aware chatbot method according to claim 6, further including: terminating a decoding of the input sentence when an EOS token is predicted.
 8. The context-aware chatbot method according to claim 1, wherein validating the answer generated by the context-aware neural conversation model further including: calculating a confidence score for the answer generated by the context-aware neural conversation model, wherein the confidence score is a normalized Kullback-Leibler distance between the question and the answer.
 9. A non-transitory computer-readable medium having a computer program for, when being executed by a processor, performing a context-aware chatbot method, the method comprising: receiving a user's voice; converting the user's voice to a question to be answered; determining a question type of the question to be answered; generating at least one answer to the question based on a context-aware neural conversation model; validating the answer generated by the context-aware neural conversation model; and delivering the answer validated to the user, wherein the context-aware neural conversation model takes contextual information of the question into consideration, and decomposes the contextual information of the question into a plurality of high dimension vectors.
 10. The non-transitory computer-readable medium according to claim 9, wherein determining a question type of the question to be answered further including: identifying a Lexical Answer Type (LAT) of the question to be answered.
 11. The non-transitory computer-readable medium according to claim 9, wherein generating at least one answer to the question based on a context-aware neural conversation model further including: given an input sentence X={x₁, x₂, . . . , x_(n)}, finding a response sentence Y={y₁, y₂, . . . , y_(n)} by taking a context EC={ec₁, ec₂, . . . , ec_(m)} into consideration, where x represents a word in the input sentence, y represents a word in the response sentence, the response sentence Y represents the answer, and the input sentence X represents the question to be answered.
 12. The non-transitory computer-readable medium according to claim 11, wherein given an input sentence X={x₁, x₂, . . . , x_(n)}, finding a response sentence Y={y₁, y₂, . . . , y_(n)} by taking a context EC={ec₁, ec₂, . . . , ec_(m)} into consideration further including: predicting y by maximizing a probability P (y_(t)|y_(t−1), . . . , y₁, ec).
 13. The non-transitory computer-readable medium according to claim 12, wherein predicting y by maximizing a probability P (y_(t)|y_(t−1), . . . , y₁, ec) further including: providing the input sentence with an input gate i_(t), a memory gate f_(t), and an output gate o_(t), by the context-aware neural conversation model; calculating a vector representation h_(t) for each time step t by: $\begin{aligned} \begin{bmatrix} i_t \\ f_t \\ o_t \\ l_t \end{bmatrix} &= \begin{bmatrix} \theta \\ \theta \\ \theta \\ \tanh \end{bmatrix} W \cdot \begin{bmatrix} h_{t-1} \\ x_t^s \end{bmatrix}, \\ c_t &= f_t \cdot c_{t-1} + i_t \cdot l_t, \\ h_t^s &= o_t \cdot \tanh(c_t), \end{aligned}$ where $W = \begin{bmatrix} W_i \\ W_f \\ W_o \\ W_l \end{bmatrix}$, where W_(i), W_(f), W_(o), W_(l)∈R^(K*2K), W denotes learned and trained factors, x_(t) denotes a vector representation for an individual word at time step t, h_(t) denotes a vector representation computed by Long Short Term Memory (LSTM) model at the time step t by combining x_(t) and h_(t−1), c_(t) denotes a cell state vector representation at time step t, and θ denotes a sigmoid function; and calculating a distribution over outputs and sequentially predicted tokens based on a softmax function $p(Y \mid X) = \prod_{t=1}^{n_y} p(y_t \mid x_1, x_2, \ldots, x_t, y_1, \ldots, y_{t-1}) = \prod_{t=1}^{n_y} \frac{\exp\left(f(h_{t-1}, e_{y_t})\right)}{\sum_{y'} \exp\left(f(h_{t-1}, e_{y'})\right)}$, where f (h_(t−1), e_(y_t)) denotes an activation function between h_(t−1) and e_(y_t).
 14. The non-transitory computer-readable medium according to claim 9, wherein validating the answer generated by the context-aware neural conversation model further including: calculating a confidence score for the answer generated by the context-aware neural conversation model, wherein the confidence score is a normalized Kullback-Leibler distance between the question and the answer.
 15. A context-aware chatbot system, comprising: a question acquisition module configured to receive a user's voice and convert the user's voice to a question to be answered; a question determination module configured to determine a question type of the question to be answered; a context-aware neural conversation module configured to generate at least one answer to the question by taking contextual information of the question into consideration and decomposing the contextual information of the question into a plurality of high dimension vectors; an evidence validation module configured to validate the answer generated by the context-aware neural conversation model; and an answer delivery module configured to deliver the answer validated to the user.
 16. The context-aware chatbot system according to claim 15, wherein the question determination module is configured to: identify a Lexical Answer Type (LAT) of the question to be answered.
 17. The context-aware chatbot system according to claim 15, wherein the context-aware neural conversation module is configured to: given an input sentence X={x₁, x₂, . . . , x_(n)}, find a response sentence Y={y₁, y₂, . . . , y_(n)} by taking a context EC={ec₁, ec₂, . . . , ec_(m)} into consideration, where x represents a word in the input sentence, y represents a word in the response sentence, the response sentence Y represents the answer, and the input sentence X represents the question to be answered.
 18. The context-aware chatbot system according to claim 17, wherein the context-aware neural conversation module is configured to: predict y by maximizing a probability P (y_(t)|y_(t−1), . . . , y₁, ec).
 19. The context-aware chatbot system according to claim 18, wherein the context-aware neural conversation module is configured to: provide the input sentence with an input gate i_(t), a memory gate f_(t), and an output gate o_(t), by the context-aware neural conversation model; calculate a vector representation h_(t) for each time step t by: $\begin{aligned} \begin{bmatrix} i_t \\ f_t \\ o_t \\ l_t \end{bmatrix} &= \begin{bmatrix} \theta \\ \theta \\ \theta \\ \tanh \end{bmatrix} W \cdot \begin{bmatrix} h_{t-1} \\ x_t^s \end{bmatrix}, \\ c_t &= f_t \cdot c_{t-1} + i_t \cdot l_t, \\ h_t^s &= o_t \cdot \tanh(c_t), \end{aligned}$ where $W = \begin{bmatrix} W_i \\ W_f \\ W_o \\ W_l \end{bmatrix}$, where W_(i), W_(f), W_(o), W_(l)∈R^(K*2K), W denotes learned and trained factors, x_(t) denotes a vector representation for an individual word at time step t, h_(t) denotes a vector representation computed by Long Short Term Memory (LSTM) model at the time step t by combining x_(t) and h_(t−1), c_(t) denotes a cell state vector representation at time step t, and θ denotes a sigmoid function; and calculate a distribution over outputs and sequentially predicted tokens based on a softmax function $p(Y \mid X) = \prod_{t=1}^{n_y} p(y_t \mid x_1, x_2, \ldots, x_t, y_1, \ldots, y_{t-1}) = \prod_{t=1}^{n_y} \frac{\exp\left(f(h_{t-1}, e_{y_t})\right)}{\sum_{y'} \exp\left(f(h_{t-1}, e_{y'})\right)}$, where f (h_(t−1), e_(y_t)) denotes an activation function between h_(t−1) and e_(y_t).
 20. The context-aware chatbot system according to claim 15, wherein the evidence validation module is further configured to: calculate a confidence score for the answer generated by the context-aware neural conversation model, wherein the confidence score is a normalized Kullback-Leibler distance between the question and the answer. 
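
For illustration only, the gate recurrence and softmax recited in claims 5 and 6 may be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names `lstm_step` and `next_token_distribution` are hypothetical, the function f is assumed here to be a dot product, vectors are plain lists, and the context EC is omitted because the recited equations do not show where it enters.

```python
import math


def sigmoid(z):
    """The function written as theta in the claims."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(W, h_prev, c_prev, x_t):
    """One time step of the recited recurrence.

    W has 4*K rows of length 2*K, stacked as [W_i; W_f; W_o; W_l].
    h_prev, c_prev and x_t are lists of length K.
    """
    K = len(h_prev)
    v = h_prev + x_t  # concatenation [h_{t-1}; x_t], length 2K
    z = [sum(w * a for w, a in zip(row, v)) for row in W]  # W . v, length 4K
    i = [sigmoid(a) for a in z[0:K]]            # input gate
    f = [sigmoid(a) for a in z[K:2 * K]]        # memory gate
    o = [sigmoid(a) for a in z[2 * K:3 * K]]    # output gate
    l = [math.tanh(a) for a in z[3 * K:4 * K]]  # candidate values
    c = [f[k] * c_prev[k] + i[k] * l[k] for k in range(K)]
    h = [o[k] * math.tanh(c[k]) for k in range(K)]
    return h, c


def next_token_distribution(h_prev, embeddings):
    """Softmax over candidate tokens.

    ASSUMPTION: f(h, e) is taken to be a dot product; the claims only
    call it an activation function.
    """
    scores = [sum(a * b for a, b in zip(h_prev, e)) for e in embeddings]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In such a sketch, decoding would call `lstm_step` once per time step, pass the resulting state to `next_token_distribution`, and stop once the EOS token is selected, as recited in claim 7.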